Building a Gold Standard for Thai WordNet
Abstract
This paper presents a method for building a gold standard test set for Thai WordNet. The results of this research can be used to evaluate or compare the output of different approaches to Thai WordNet construction. In this research, a part of Thai WordNet is carefully handcrafted from the FirstOrderEntities of the Common Base Concepts, using five translation resources. However, we found that building a gold standard test set is not as easy as finding words that fit the definitions of synsets; one also has to be aware of the cultural gaps between the languages.
Similar resources
BabelNet: Building a Very Large Multilingual Semantic Network
In this paper we present BabelNet, a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, Machine Translation is applied to enrich the resource with lexical information for all languages. We conduct experiments on new and ...
What implementation and translation teach us: the case of semantic similarity measures in wordnets
WordNet::Similarity is an important instrument used in many applications. It has been available for a while as a toolkit for English, and it has frequently been tested on English gold standards. In this paper, we describe how we constructed a Dutch gold standard that matches the English gold standard as closely as possible. We also re-implemented the WordNet::Similarity package to be able to de...
Semantic Similarity Measures for the Development of Thai Dialog System
Semantic similarity plays an important role in a number of applications including information extraction, information retrieval, document clustering and ontology learning. Most work has concentrated on English and other European languages. However, for the Thai language, there has been no research about word semantic similarity. This paper presents an experiment and benchmark data sets investig...
Thai WordNet Construction
This paper describes the semi-automatic construction of Thai WordNet and the method applied for Asian WordNet. Based on the Princeton WordNet (PWN), we develop a method for generating a WordNet by using an existing bilingual dictionary. We automatically align each PWN synset to the bilingual dictionary through the English equivalent and its part-of-speech (POS). Manual translation is also employed after th...
Senseval-3 task: Word Sense Disambiguation of WordNet glosses
The SENSEVAL-3 task to perform word-sense disambiguation of WordNet glosses was designed to encourage development of technology to make use of standard lexical resources. The task was based on the availability of sense-disambiguated, hand-tagged glosses created in the eXtended WordNet project. The hand-tagged glosses provided a “gold standard” for judging the performance of automated disambiguati...